A Multi-media Approach to Cross-lingual Entity Knowledge Transfer
نویسندگان
چکیده
When a large-scale incident or disaster occurs, there is often a great demand for rapidly developing a system to extract detailed and new information from lowresource languages (LLs). We propose a novel approach to discover comparable documents in high-resource languages (HLs), and project Entity Discovery and Linking results from HLs documents back to LLs. We leverage a wide variety of language-independent forms from multiple data modalities, including image processing (image-to-image retrieval, visual similarity and face recognition) and sound matching. We also propose novel methods to learn entity priors from a large-scale HL corpus and knowledge base. Using Hausa and Chinese as the LLs and English as the HL, experiments show that our approach achieves 36.1% higher Hausa name tagging F-score over a costly supervised model, and 9.4% higher Chineseto-English Entity Linking accuracy over state-of-the-art.
منابع مشابه
Enhancing Multi-lingual Information Extraction via Cross-Media Inference and Fusion
We describe a new information fusion approach to integrate facts extracted from cross-media objects (videos and texts) into a coherent common representation including multi-level knowledge (concepts, relations and events). Beyond standard information fusion, we exploited video extraction results and significantly improved text Information Extraction. We further extended our methods to multi-lin...
متن کاملEnglish-Persian Plagiarism Detection based on a Semantic Approach
Plagiarism which is defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them” poses a major challenge to knowledge spread publication. Plagiarism has been placed in four categories of direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism which is sometimes referred to as cross-li...
متن کاملLearning a Cross-Lingual Semantic Representation of Relations Expressed in Text
Learning cross-lingual semantic representations of relations from textual data is useful for tasks like cross-lingual information retrieval and question answering. So far, research has been mainly focused on cross-lingual entity linking, which is confined to linking between phrases in a text document and their corresponding entities in a knowledge base but cannot link to relations. In this pape...
متن کاملOverview of TAC-KBP2017 13 Languages Entity Discovery and Linking
In this paper we give an overview of the Tri-lingual Entity Discovery and Linking (EDL) task at the Knowledge Base Population (KBP) track at TAC2017, and of the Ten Low Resource Language EDL Pilot. We will summarize several new and effective research directions including multi-lingual common space construction for cross-lingual knowledge transfer, rapid approaches for silver-standard training d...
متن کاملCross-Lingual Cross-Document Coreference with Entity Linking
This paper describes our approach to the 2011 Text Analysis Conference (TAC) Knowledge Base Population (KBP) cross-lingual entity linking problem. We recast the problem of entity linking as one of cross-document entity coreference. We compare an approach where deductive entity linking informs crossdocument coreference to an inductive approach where coreference and linking judgements are mutuall...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016